 Assembler course part 1

 by Wanja Gayk

 translated by Kendra Thiemann
 revised by Nate Dannenberg

 Assembly. Machine Language. For many people these are still
 riddles without solutions. But many experienced programmers
 swear that Assembler is THE language to use, and that it is
 easier and more flexible than BASIC. But why? Well, first of
 all, Assembler instructions are very small and don't
 actually do much by themselves. However, their combined
 effect can be quite impressive. To understand these
 instructions, we first have to learn to count in binary and
 hexadecimal, since most operations are easier to understand
 when expressed in this form.

 BITS AND BYTES

 First, let's talk about bits. A bit is the smallest unit of
 information and it can have one of two states: set or clear
 (or set or reset, if you prefer). In other words, your
 choices are 1 and 0. Eight bits make one byte, so with a
 little simple math we find that a byte has 2 to the 8th
 power, or 256, possible values.
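
 As a quick check of that arithmetic, here is a minimal
 Python sketch. Python is not part of this course; it is
 used here only as a calculator, and the variable names are
 made up for illustration.

```python
# Count the distinct values that fit in one byte.
BITS_PER_BYTE = 8

values_per_bit = 2  # a bit is either 0 (clear) or 1 (set)
values_per_byte = values_per_bit ** BITS_PER_BYTE

print(values_per_byte)                      # 256

# The smallest and largest byte, written out bit by bit.
print(format(0, "08b"))                     # 00000000
print(format(values_per_byte - 1, "08b"))   # 11111111
```

 The largest value is 255 rather than 256 because counting
 starts at zero.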

 HEXADECIMAL AND DECIMAL

 When you count, you do so starting at zero or one, working
 your way up to ten. Take a look at the following sequence of
 numbers:

 0,1,2,3,4,5,6,7,8,9

 You will notice that to represent any number, you need only
 one digit, until you hit 10, at which point you need two
 digits to describe your number. Keep counting and you
 eventually need to add even more digits as you reach 100,
 1000 and so on.

 In Hexadecimal (or "Hex"), it works a little differently.
 Consider the following sequence:

 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F

 In this sequence we have 16 values, ranging from 0 to F. But
 what do the letters A through F mean? Well, since our
 everyday number system has no single-digit symbols for the
 values 10 through 15, we are forced to borrow letters of the
 alphabet, and so A through F became the standard: A is 10,
 B is 11, and so on up to F, which is 15. To further
 differentiate a decimal number from a Hex number, most
 coders put a dollar sign '$' in front of a Hex number.

 As with decimal, we eventually have to add another digit to
 our count as we continue to increase in value. With decimal,
 we did this at 10. In Hex, we add a digit when we hit 16.

 In decimal, you could say that a number WXYZ is W*1000 +
 X*100 + Y*10 + Z. So, the value 1234 would be 1000 + 200 +
 30 + 4 of course.

 In Hex, you would thus say that a number WXYZ is W*4096 +
 X*256 + Y*16 + Z. As an example, $1234 (Hex) would be 1*4096
 + 2*256 + 3*16 + 4, or 4660, when expressed in decimal.
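
 The place-value rule can be sketched in a few lines of
 Python. This is only an illustration: the function name
 hex_to_decimal is made up for this example, and Python's
 built-in int(text, 16) does the same job.

```python
def hex_to_decimal(digits: str) -> int:
    """Convert a string of hex digits to a number, one place at a time."""
    symbols = "0123456789ABCDEF"  # position in this string = digit value
    total = 0
    for ch in digits.upper():
        # Shift everything so far one hex place left, then add the new digit.
        total = total * 16 + symbols.index(ch)
    return total

print(hex_to_decimal("1234"))          # 4660
print(1*4096 + 2*256 + 3*16 + 4)       # 4660, the long way
print(int("1234", 16))                 # 4660, using the built-in
print(hex_to_decimal("FF"))            # 255
```

 All three approaches agree that $1234 is 4660 in decimal.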

 In each case, each of the four digits in the numbers above
 is just that, a single digit, either 0 through 9 for the
 decimal system, or 0 through F for Hexadecimal.

 As mentioned above, one byte can hold any value from 0 to
 255 ($00 to $FF). To hold larger numbers, like $1234 in the
 above example, we simply use two bytes, each storing two
 digits from the number. The byte representing the rightmost
 digits ($34 here) is called the "LSB" or "Least Significant
 Byte", while the byte representing the leftmost digits ($12
 here) is called the "MSB" or "Most Significant Byte."
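
 Splitting a two-byte number into its LSB and MSB is simple
 arithmetic. The following Python sketch shows one way to do
 it; the variable names are chosen for this example only.

```python
value = 0x1234              # Python writes $1234 as 0x1234

lsb = value & 0xFF          # keep the two rightmost hex digits
msb = (value >> 8) & 0xFF   # shift right one byte, keep what is left

print(f"MSB = ${msb:02X}, LSB = ${lsb:02X}")   # MSB = $12, LSB = $34

# Putting the two bytes back together gives the original number.
rebuilt = msb * 256 + lsb
print(f"${rebuilt:04X}")                       # $1234
```

 The MSB is multiplied by 256 because it sits two hex digits
 (one full byte) to the left of the LSB.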

 ASSEMBLY LANGUAGE - STEP BY STEP

 THE REGISTERS:

 In Assembly, your primary activity will be moving data back
 and forth, manipulating bits and bytes, and making
 comparisons and jumps throughout your program.

 Inside every Commodore 64 is a MOS 6510 processor, the big
 brother to the 6502 that is used in the VIC-20 and most disk
 drives. This processor has three registers that can be used
 for many purposes: the accumulator, denoted here as ".A",
 the X Index Register, denoted as ".X", and the Y Index
 Register, denoted as ".Y".

 Each register is one byte in size, hence each holds a value
 from $00 to $FF (Hex).

 THE MAIN STORAGE:

 In addition to the registers, the 6510 has access to 65536
 bytes of User-Programmable memory. Of course, every
 Commodore 64 comes fully loaded with a full 64K, which is
 enough to suit almost any need. Each individual byte has its
 own address. Addresses run from $0000 to $FFFF, but the very
 first two bytes ($0000 and $0001) are taken by the processor
 for its on-board parallel port, which leaves the range $0002
 to $FFFF.
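
 The numbers 64K, 65536 and $FFFF all describe the same
 thing. Here is a short Python sketch of how they relate,
 assuming (as is the case on the 6510) that an address is
 two bytes, or 16 bits, wide.

```python
ADDRESS_BITS = 16                       # two bytes per address

address_count = 2 ** ADDRESS_BITS
print(address_count)                    # 65536
print(address_count // 1024)            # 64, as in "64K"

# Addresses start at zero, so the last one is one less than the count.
print(f"${address_count - 1:04X}")      # $FFFF
```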

 THE FIRST COMMANDS:

 Now it is time for you to load and start a machine language
 monitor. If you have an Action-Replay, Final Cartridge,
 Action Gear, Nordic Power or anything comparable, you can
 use the command MON to jump from the BASIC interpreter into
 the Cartridge's internal Machine Language Monitor.

 Now let's talk about the most important commands: LDA, STA
 and JMP.

 LDA:

 LDA is an abbreviation (Mnemonic) for Load Accumulator. We
 use LDA to load a one-byte value into .A. The simplest LDA
 command is LDA #Value. As an example, LDA #$01 flat-out
 loads the value 1 into .A, LDA #$02 the value 2, and so on
 for any value $00 through $FF. Note that the "#" sign is
 required, to specify "immediate" mode.

 STA:

 STA is the abbreviation for Store Accumulator. With STA we
 store the contents of .A to someplace in main memory (or
 perhaps, into an I/O chip like the SID). The contents of .A
 are left unchanged after the store operation. The simplest
 STA command is STA Address. For example, STA $3000 would
 store the contents of .A into location $3000 in main memory.
 STA $0400 would store to $0400, which is the start of your
 40 column display.

 JMP:

 JMP is the abbreviation for Jump. JMP is the 6510's "GOTO"
 command. Every address of the main storage can contain data
 or programs. With JMP you order the processor to stop what
 it's doing, move to a new place in your program or perhaps
 into the Operating System, and begin executing. Normally,
 you write the JMP command as JMP $nnnn where $nnnn is a
 location in the C64's main memory.

 THE FIRST SMALL PROGRAM:

 For our first small program we only need the three commands
 described above and two important storage locations you
 should keep in mind: $D020 and $D021. $D020 is the control
 byte for the screen's border color, while $D021 controls the
 background color of the text-portion of the screen.

 And now, on to the ML Monitor. Below is a display typical of
 what you will see when you start your ML Monitor (either
 with "mon", "m or shift-N" on the C128 in C128 Native Mode,
 or by using a menu within your utility cartridge):

 MON
 B*
 ADDR AR XR YR SP 01 NV-BDIZC
 .; FFFF 00 00 00 F8 37 00000010

 The header line labels the values shown beneath it. From
 left to right they are: the address (where the computer was
 executing when the monitor was called), the .A, .X and .Y
 registers, the Stack Pointer, the value of location $0001,
 and the 6510's various status flags (more on these last
 three items later).

 Let's try our first program. Type in the following few lines
 of code. Enter each line without the leading period (your ML
 Monitor will usually put it there for you), and press Return
 at the end of each line. Depending on the ML Monitor you are
 using, the line may either be accepted as-is, corrected in
 some way, or altered to include such information as the
 hexadecimal values that make up the code you've entered.

 As you enter each line, the ML Monitor will print the
 address of the next instruction and position the cursor to
 the right of that address, sort of like an
 "auto-line-number" feature.

 .A 2000 LDA #$00
 .A 2002 STA $D020
 .A 2005 STA $D021
 .A 2008 LDA #$01
 .A 200A STA $D020
 .A 200D STA $D021
 .A 2010 JMP $2000
 .A 2013 (just press Return)

 What this program does:
 2000 Load .A with the value $00
 2002 Write contents of .A to the VIC chip's Border Color
 Register (#$00 was loaded into .A on the previous line, so
 this turns the border black)
 2005 Write contents of .A (still #$00) to the VIC chip's
 Background Color Register. This turns the background black
 as well.
 2008 Load .A with the value $01
 200A Write contents of .A to the Border Color Register.
 Since we just loaded #$01 into .A, the border will now turn
 white.
 200D Write contents of .A to the Background Color Register
 (turns the background white)
 2010 Jump to memory location $2000 and continue. Since we
 are JMP'ing back to the beginning of the program, we created
 an "infinite" loop.
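
 To follow the walkthrough above without a C64 at hand, here
 is a small Python sketch that steps through the same seven
 instructions. It is a toy model, not an emulator: it knows
 only the three opcodes used here ($A9 for LDA #value, $8D
 for STA address, $4C for JMP address), the names run and
 PROGRAM are made up for this example, and the loop is cut
 off after a fixed number of steps so that it ends.

```python
LDA_IMMEDIATE = 0xA9   # LDA #value  (2 bytes)
STA_ABSOLUTE = 0x8D    # STA address (3 bytes)
JMP_ABSOLUTE = 0x4C    # JMP address (3 bytes)

PROGRAM_START = 0x2000
PROGRAM = bytes([
    0xA9, 0x00,        # 2000 LDA #$00
    0x8D, 0x20, 0xD0,  # 2002 STA $D020 (address stored low byte first)
    0x8D, 0x21, 0xD0,  # 2005 STA $D021
    0xA9, 0x01,        # 2008 LDA #$01
    0x8D, 0x20, 0xD0,  # 200A STA $D020
    0x8D, 0x21, 0xD0,  # 200D STA $D021
    0x4C, 0x00, 0x20,  # 2010 JMP $2000
])

def run(steps):
    """Execute `steps` instructions; return a list of (address, value) writes."""
    memory = bytearray(0x10000)                 # 64K, all zero
    memory[PROGRAM_START:PROGRAM_START + len(PROGRAM)] = PROGRAM
    pc = PROGRAM_START                          # where we are executing
    a = 0                                       # the accumulator, .A
    writes = []
    for _ in range(steps):
        opcode = memory[pc]
        if opcode == LDA_IMMEDIATE:
            a = memory[pc + 1]
            pc += 2
        elif opcode == STA_ABSOLUTE:
            address = memory[pc + 1] + 256 * memory[pc + 2]  # LSB + MSB*256
            memory[address] = a                 # .A itself is left unchanged
            writes.append((address, a))
            pc += 3
        elif opcode == JMP_ABSOLUTE:
            pc = memory[pc + 1] + 256 * memory[pc + 2]
        else:
            raise ValueError(f"unsupported opcode ${opcode:02X}")
    return writes

# Seven steps is one full pass through the loop, including the JMP.
for address, value in run(7):
    print(f"${address:04X} <- ${value:02X}")
```

 One pass prints four stores: $00 to $D020 and $D021, then
 $01 to $D020 and $D021. Running more steps simply repeats
 the same four stores, which is the "infinite" loop at work.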

 You may run this program by entering the command G 2000 at
 the next available ML Monitor prompt (if it produces one,
 usually a ".").

 AND IN BASIC?

 This is how the program might look if written in BASIC:

 10 poke 53280,0
 20 poke 53281,0
 30 poke 53280,1
 40 poke 53281,1
 50 goto 10

 In both cases we simply make the border and background
 colors flicker wildly (black to white, over and over).
 You'll notice the BASIC version runs considerably slower:
 the screen fills with broad stripes instead of the thin,
 broken lines produced by the machine language version.
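
 The POKE addresses in the BASIC version are the same two
 locations used in the machine language version, just written
 in decimal. A quick Python check (the names BORDER and
 BACKGROUND are only labels for this example):

```python
BORDER = 0xD020        # $D020, border color
BACKGROUND = 0xD021    # $D021, background color

print(BORDER, BACKGROUND)                  # 53280 53281

# The same conversion by hand: D is 13, so $D020 is
print(13 * 4096 + 0 * 256 + 2 * 16 + 0)    # 53280
```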

 The BASIC program does look smaller, doesn't it? Actually,
 it's larger. The custom crafted machine code takes a mere 19
 bytes of space (from $2000 to $2012), while the BASIC
 version hogs a whopping 52 bytes! Part of the reason for
 this is that the numbers 53280 and 53281 are actually being
 spelled out digit for digit in the program, one byte per
 digit, while the numbers $D020 and $D021 in our ML example
 are being stored as binary numbers, taking only two bytes
 each.

 In addition, BASIC is full of things like "line links" and
 line number values. All of these generally make BASIC slow
 and bloated in comparison.
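
 The 19-byte figure for the machine language version can be
 checked by adding up the instruction lengths, as in this
 Python sketch. The sizes used are 2 bytes for LDA #value
 (opcode plus value) and 3 bytes for STA or JMP with a full
 address (opcode plus two address bytes). The BASIC size is
 not recomputed here, since it depends on how the lines were
 typed in.

```python
# Each instruction from the listing, with its length in bytes.
listing = [
    ("LDA #$00", 2),
    ("STA $D020", 3),
    ("STA $D021", 3),
    ("LDA #$01", 2),
    ("STA $D020", 3),
    ("STA $D021", 3),
    ("JMP $2000", 3),
]

start = 0x2000
total = sum(size for _, size in listing)
last = start + total - 1               # address of the final byte

print(total)                           # 19
print(f"${last:04X}")                  # $2012

# The same address costs more when spelled out as text.
print(len("53280"))                            # 5 bytes as decimal digits
print(len((53280).to_bytes(2, "little")))      # 2 bytes as a binary number
```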

 CONCLUSION

 As you can see, Machine Language really isn't all that
 complex. Just as learning to program in BASIC seemed
 complicated at first, you simply have to break the ice and
 start with something small. Once you've gotten your feet
 wet, you'll see it's really pretty easy to learn.

 For those who want to get into Machine Language now, without
 waiting for future articles and hints, at least start by
 picking up a pocket calculator that features Hexadecimal and
 Binary conversion keys. Some calculators in the Casio FX
 series feature these, and they are quite handy.
